Abstract

There are many articles discussing the solution of boundary value problems
on various parallel machines. The solution of initial value problems does
not lend itself to parallelism, since in this case one uses methods that are
sequential in nature.

Here we develop a parallel scheme for initial value problems based on the
box scheme and a modified recursive doubling technique.

Fully implicit Runge-Kutta methods were discussed by Jackson and Norsett
(1986) and Lie (1987). Lie assumes that each processor of the parallel
computer has vector capabilities.

1      Introduction 

We consider the solution of linear initial value problems on a hypercube. "By a
hypercube we intend a distributed memory MIMD computer with communication
between processors ... via a communication network having the topology of
a p-dimensional cube, with the vertices considered as processors and the edges
as communication links" (Keller and Nelson, 1987). See also Fox (1984, 1985,
1987) and Fox and Otto (1984). Our method of solution uses the box scheme to
discretize the system of initial value problems

$$ y' = Ay + f(x), \qquad y(a) = y_0, $$

where $y$ and $f$ are n-dimensional vectors and $A$ is an n x n matrix. The
resulting system of equations is solved by a modified version of the recursive
doubling technique (see Stone, 1973).

In the next section, the discretization is described and the resulting system
of equations is given. Section 3 describes the modified recursive doubling
technique and its application to our system.

It will be interesting to experiment with the method and compare the results
to a sequential initial value solver of the same order.

2      The  Single  Step  Method 

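Consider the system of initial value problems

$$ y' = A(x)\,y + f(x), \quad a < x < b, \qquad y(a) = y_0, \tag{1} $$

where

$$ y = (y_1, \ldots, y_n)^T, \quad f = (f_1(x), \ldots, f_n(x))^T, \quad y_0 = (y_{10}, \ldots, y_{n0})^T, $$

and

$$ A = (a_{ij}(x)), \quad 1 \le i, j \le n. $$

Let

$$ x_j = a + jh, \quad j = 0, 1, \ldots, m, \tag{2} $$

where

$$ h = \frac{b - a}{m}, \tag{3} $$

be a uniform mesh. The box scheme (see e.g. Keller, 1976), applied to (1), yields

$$ y_{j+1} = y_j + h \left\{ A_{j+1/2}\, \frac{y_{j+1} + y_j}{2} + f_{j+1/2} \right\}, \qquad y_0 = y(a), \tag{4} $$

where

$$ A_{j+1/2} = A\left(a + \left(j + \tfrac{1}{2}\right) h\right) $$

and $y_j$ is the approximation to $y(x_j)$.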
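A single step of (4) can be made concrete with a short sketch. The code below is illustrative only and assumes the scalar case n = 1, so that the linear solve in each step reduces to a division. The function name `box_scheme` and the test problem y' = -y, y(0) = 1 are choices made here, not taken from the paper.

```python
import math

def box_scheme(a_coef, f, x0, x1, y0, m):
    """Integrate the scalar problem y' = a_coef(x) y + f(x), y(x0) = y0,
    with m box-scheme steps on a uniform mesh (assumption: n = 1)."""
    h = (x1 - x0) / m
    y = y0
    for j in range(m):
        xm = x0 + (j + 0.5) * h            # midpoint x_{j+1/2}
        am, fm = a_coef(xm), f(xm)
        # Rearranged form of (4): (1 - h/2 a) y_{j+1} = (1 + h/2 a) y_j + h f
        y = ((1.0 + 0.5 * h * am) * y + h * fm) / (1.0 - 0.5 * h * am)
    return y

# Hypothetical test problem: y' = -y, y(0) = 1, exact solution exp(-x).
approx = box_scheme(lambda x: -1.0, lambda x: 0.0, 0.0, 1.0, 1.0, 100)
error = abs(approx - math.exp(-1.0))
```

The division is only valid while 1 - (h/2) a is nonzero, which is the scalar form of the nonsingularity condition needed to solve (4) for $y_{j+1}$.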

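Let $\{j_i,\ i = 1, 2, \ldots, s\}$ be a strictly increasing sequence such that
$j_1 > 0$ and $j_s = m$. We shall compute the solution at the points
$\xi_i = x_{j_i}$. Let $\Phi_i$ be the n x n matrices defined for each $i$ by

$$ \Phi_i = \prod_{j=j_{i-1}}^{j_i - 1} \left(I - \frac{h}{2} A_{j+1/2}\right)^{-1} \left(I + \frac{h}{2} A_{j+1/2}\right), \quad i = 1, 2, \ldots, s, \tag{5} $$

where $j_0 = 0$ and $h$ is sufficiently small so that the matrices
$I - \frac{h}{2} A_{j+1/2}$ are nonsingular.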
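Similarly let the n-vector $\varphi_i$ be

$$ \varphi_i = \left(I - \frac{h}{2} A_{j_i - 1/2}\right)^{-1} \left[ \left(I + \frac{h}{2} A_{j_i - 1/2}\right) y_{j_i - 1 - j_{i-1}} + h f_{j_i - 1/2} \right], \quad i = 1, 2, \ldots, s, \tag{6} $$

where the auxiliary vectors $y_j$ (local to the $i$-th subinterval) satisfy

$$ y_{j+1} = \left(I - \frac{h}{2} A_{j_{i-1} + j + 1/2}\right)^{-1} \left[ \left(I + \frac{h}{2} A_{j_{i-1} + j + 1/2}\right) y_j + h f_{j_{i-1} + j + 1/2} \right], \quad j = 0, \ldots, j_i - j_{i-1} - 2, \tag{7} $$

and

$$ y_0 = 0. \tag{8} $$

Then it can be easily shown, as in Keller and Nelson (1987), that

$$ y_{j_i} = \Phi_i\, y_{j_{i-1}} + \varphi_i, \quad i = 1, 2, \ldots, s. \tag{9} $$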


Remarks 

1. The matrices to be inverted are of order n, the number of equations in the
original system (1).

2. The last factor in the product defining $\Phi_i$ is the matrix required in
computing $\varphi_i$.

3. The vector $y_{j_i - 1 - j_{i-1}}$ can be computed by (7)-(8) in the same
loop that computes $\Phi_i$, since it requires the same matrices.
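The reduction from the fine mesh to the coarse recurrence (9) can be checked numerically. The sketch below is a hedged illustration for a 2 x 2 system, with all names (`step`, `propagate`, `block`), the coefficient functions, and the break points chosen here for the example. It forms each $\Phi_i$ column by column by pushing unit vectors through the homogeneous box steps of block i, and forms $\varphi_i$ by pushing the zero vector through the forced steps. By linearity this should agree with the product (5) and with (6)-(8), with the factor for the largest j acting last.

```python
import math

def step(A, f, x, h, y, forced=True):
    """One box-scheme step from x to x + h for a 2x2 system (Cramer's rule solve)."""
    xm = x + 0.5 * h
    Am = A(xm)
    fm = f(xm) if forced else [0.0, 0.0]
    # Right-hand side (I + h/2 A) y + h f
    rhs = [y[r] + 0.5 * h * (Am[r][0] * y[0] + Am[r][1] * y[1]) + h * fm[r]
           for r in range(2)]
    # Left-hand matrix I - h/2 A, assumed nonsingular for this h
    L = [[(1.0 if r == c else 0.0) - 0.5 * h * Am[r][c] for c in range(2)]
         for r in range(2)]
    det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
    return [(rhs[0] * L[1][1] - L[0][1] * rhs[1]) / det,
            (L[0][0] * rhs[1] - L[1][0] * rhs[0]) / det]

def propagate(A, f, a, h, j_start, j_end, y, forced=True):
    """Apply box steps j_start, ..., j_end - 1 to the vector y."""
    for j in range(j_start, j_end):
        y = step(A, f, a + j * h, h, y, forced)
    return y

def block(A, f, a, h, j_prev, j_cur):
    """Return (Phi_i, phi_i) for the mesh steps j_prev .. j_cur - 1."""
    cols = [propagate(A, f, a, h, j_prev, j_cur, e, forced=False)
            for e in ([1.0, 0.0], [0.0, 1.0])]
    Phi = [[cols[c][r] for c in range(2)] for r in range(2)]  # columns -> matrix
    phi = propagate(A, f, a, h, j_prev, j_cur, [0.0, 0.0])
    return Phi, phi

# Hypothetical example data.
A = lambda x: [[0.0, 1.0], [-1.0, -0.1 * x]]
f = lambda x: [0.0, math.sin(x)]
a, b, m = 0.0, 2.0, 40
h = (b - a) / m
y0 = [1.0, 0.0]
breaks = [0, 10, 25, 40]            # j_0 = 0 < j_1 < j_2 < j_3 = m

fine = propagate(A, f, a, h, 0, m, y0)          # all m steps of (4)
coarse = y0
for j_prev, j_cur in zip(breaks, breaks[1:]):   # s = 3 steps of (9)
    Phi, phi = block(A, f, a, h, j_prev, j_cur)
    coarse = [Phi[r][0] * coarse[0] + Phi[r][1] * coarse[1] + phi[r]
              for r in range(2)]
```

The pairs $(\Phi_i, \varphi_i)$ for different i do not depend on one another, so in this sketch the calls to `block` could be assigned to separate processors; only the final loop over (9) is sequential.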

3      Parallel  Evaluation 

To solve (9) on a hypercube with p = s processors, one can modify the recursive
doubling technique developed by Stone (1973).

Let 

$$ b_1 = \Phi_1 y_0 + \varphi_1, \qquad b_j = \varphi_j, \quad j = 2, 3, \ldots, s, \tag{10} $$

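and let $Y_i(j)$ be a function of $b_j, b_{j-1}, \ldots, b_{j-i+1}, \Phi_j, \ldots, \Phi_{j-i+1}$.
Then the following results can be proved using arguments similar to those in
Stone (1973).

Theorem. Let $Y_i(j)$ satisfy the recurrence relation

$$ Y_{i+1}(j) = Y_1(j) + \Phi_j\, Y_i(j-1), \quad i, j \ge 1, \tag{11} $$

with boundary conditions

$$ Y_1(j) = b_j, \quad j \ge 1; \qquad Y_i(j) = 0, \quad j \le 0 \text{ or } i \le 0. \tag{12} $$

Then

(i)

$$ Y_{i+s}(j) = Y_s(j) + \left( \prod_{k=j-s+1}^{j} \Phi_{2j-k-s+1} \right) Y_i(j-s), \tag{13} $$

(ii)

$$ Y_i(j) = \sum_{k=j-i+1}^{j} \left( \prod_{l=k+1}^{j} \Phi_{j-l+k+1} \right) Y_1(k), \tag{14} $$

(iii) for $i \ge j \ge 1$,

$$ Y_i(j) = y_{j_j}. \tag{15} $$

Corollary.

$$ Y_{2i}(j) = Y_i(j) + \left( \prod_{k=j-i+1}^{j} \Phi_{2j-k-i+1} \right) Y_i(j-i), \quad i, j \ge 1. \tag{16} $$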
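As an illustrative check of the corollary (a worked case added here, not part of the original argument), take four points and follow the entry j = 4. Using only (11), (12) and (16), with the products written so that the factor with the largest index stands on the left:

$$
\begin{aligned}
Y_2(4) &= Y_1(4) + \Phi_4\, Y_1(3) = b_4 + \Phi_4 b_3, \\
Y_2(2) &= Y_1(2) + \Phi_2\, Y_1(1) = b_2 + \Phi_2 b_1, \\
Y_4(4) &= Y_2(4) + \Phi_4 \Phi_3\, Y_2(2)
        = b_4 + \Phi_4 b_3 + \Phi_4 \Phi_3 b_2 + \Phi_4 \Phi_3 \Phi_2 b_1 .
\end{aligned}
$$

Unrolling (9) four times from $y_0$ and substituting (10) gives the same expression for $y_{j_4}$, which is statement (iii) of the theorem for i = j = 4. Two doubling steps replace the four sequential steps.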

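This corollary provides the recursive doubling algorithm for the solution of
(9). Let

$$ M_i(j) = \begin{cases} \prod_{k=j-i+1}^{j} \Phi_{2j-k-i+1}, & j \ge i, \\ \prod_{k=1}^{j} \Phi_{j-k+1}, & j < i; \end{cases} \tag{17} $$

then (16) can be written as

$$ Y_{2i}(j) = Y_i(j) + M_i(j)\, Y_i(j-i), \qquad M_{2i}(j) = M_i(j)\, M_i(j-i), \quad i, j \ge 1, \tag{18} $$

with boundary conditions

$$ M_1(j) = \Phi_j, \quad j \ge 1; \qquad M_i(j) = I, \quad j \le 0 \text{ or } i \le 0. \tag{19} $$

We are now ready to state the algorithm.

Algorithm

For $i = 1$ to $s/2$ in steps of $i$ do:

    $Y_{2i}(j) = Y_i(j) + M_i(j)\, Y_i(j-i)$,    $1 \le j \le s$
    $M_{2i}(j) = M_i(j)\, M_i(j-i)$,             $1 \le j \le s$

Next $i$.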

From our theorem, $Y_s(j) = y_{j_j}$ for $1 \le j \le s$, so that $Y_s$ is the
solution of (9). We note that for each $i$, the computations for the different
indices $j$ are executed simultaneously on s processors. Since $i$ doubles
during each iteration, $\log_2 s$ iterations are required for the computation.
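The doubling loop can be emulated serially to see that it reproduces the sequential recurrence. The sketch below is an illustration under stated assumptions, not the authors' hypercube implementation: the loop over j runs in order here, although its iterations are independent. The names `recursive_doubling` and `sequential` and the example data are invented for this example. The loop condition `i < s` is a choice made here so that s need not be a power of two.

```python
def mat_mul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    n = len(P)
    return [[sum(P[r][k] * Q[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def mat_vec(P, v):
    """Matrix-vector product."""
    return [sum(P[r][k] * v[k] for k in range(len(v))) for r in range(len(P))]

def recursive_doubling(Phi, b):
    """Solve z_j = Phi_j z_{j-1} + b_j, z_0 = 0, for j = 1..s (0-based lists).

    Serial emulation: every pass over j uses only values from the previous
    pass, so the s updates in a pass could run on s processors at once.
    """
    s = len(b)
    Y = [list(v) for v in b]                     # Y_1(j) = b_j
    M = [[row[:] for row in P] for P in Phi]     # M_1(j) = Phi_j
    i = 1
    while i < s:
        Y_new, M_new = [], []
        for j in range(s):
            if j - i >= 0:
                # Y_2i(j) = Y_i(j) + M_i(j) Y_i(j-i);  M_2i(j) = M_i(j) M_i(j-i)
                Y_new.append([u + w for u, w in
                              zip(Y[j], mat_vec(M[j], Y[j - i]))])
                M_new.append(mat_mul(M[j], M[j - i]))
            else:
                # Boundary conditions: Y_i(j-i) = 0 and M_i(j-i) = I
                Y_new.append(Y[j])
                M_new.append(M[j])
        Y, M = Y_new, M_new
        i *= 2                                   # i doubles each iteration
    return Y

def sequential(Phi, b):
    """Reference solution by stepping through the recurrence in order."""
    z, out = [0.0] * len(b[0]), []
    for P, v in zip(Phi, b):
        z = [u + w for u, w in zip(mat_vec(P, z), v)]
        out.append(z)
    return out

# Hypothetical data: five non-commuting 2x2 matrices and right-hand sides.
Phi = [[[1.0, 0.1 * k], [-0.2, 0.9]] for k in range(1, 6)]
b = [[float(k), 1.0] for k in range(1, 6)]
doubled = recursive_doubling(Phi, b)
serial = sequential(Phi, b)
```

Because $b_1$ already absorbs $\Phi_1 y_0$ through (10), the recurrence starts from zero and $\Phi_1$ plays no further role. The matrix products are kept in the order $M_i(j)\,M_i(j-i)$, which matters since the $\Phi_j$ need not commute.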